Reducing delays associated with inserting a checksum into a network message

ABSTRACT

A first partial checksum for the header portion of a TCP header is generated on an intelligent network interface card (INIC) before all the data of the data payload of the TCP message has been transferred to the INIC. A pseudopacket with the first partial checksum and the data is assembled in DRAM on the INIC as the data arrives onto the INIC. When the last portion of the data of the data payload is received onto the INIC, a second partial checksum for the data payload is generated. The pseudopacket is read out of DRAM for transfer to a network. While the pseudopacket is being transferred, the second partial checksum is combined with the first partial checksum and the resulting final checksum is inserted into the pseudopacket so that a complete TCP packet with a correct checksum is output from the INIC to the network.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit under 35 U.S.C. §120 of U.S. patent application Ser. No. 09/464,283, filed Dec. 15, 1999, by Laurence B. Boucher et al., which in turn claims the benefit under 35 U.S.C. §120 of U.S. patent application Ser. No. 09/439,603, filed Nov. 12, 1999, by Laurence B. Boucher et al., which in turn claims the benefit under 35 U.S.C. § 119(e) (1) of the Provisional Application Serial No. 60/061,809, filed on Oct. 14, 1997. This application also claims the benefit under 35 U.S.C. §120 of U.S. patent application Ser. No. 09/384,792, filed Aug. 27, 1999, which in turn claims the benefit under 35 U.S.C. § 119(e) (1) of the Provisional Application Serial No. 60/098,296, filed Aug. 27, 1998. This application also claims the benefit under 35 U.S.C. §120 of U.S. patent application Ser. No. 09/067,544, filed Apr. 27, 1998. The subject matter of all the above-identified patent applications, and of the two above-identified provisional applications, is incorporated by reference herein.

BACKGROUND INFORMATION

[0002] FIG. 1 (Prior Art) is a simplified diagram of a TCP packet. FIG. 2 is a simplified diagram of a network interface card (NIC) 100 called an intelligent network interface card (INIC). One of the operations the INIC performs is to read data for a TCP packet out of host memory 101 on a host computer 102 and to transmit that data as the data payload of a TCP message onto a network 107.

[0003] A difficulty associated with performing this operation quickly is that the checksum of the TCP packet is located near the front of the packet before the data payload. The checksum is a function of all the data of the data payload. Consequently all the data of the payload must generally be transferred to the INIC 100 before the checksum can be generated. Consequently, in general, all the data of the payload is received onto the INIC card, the checksum 104 is generated, the checksum 104 is then combined with the data payload in DRAM 105 to form the complete TCP packet 106, and the complete TCP packet 106 is then transferred from DRAM 105 to the network 107.

[0004] FIG. 2 illustrates this flow of information. Arrow 108 illustrates the flow of data from host memory 101 and across PCI bus 103 to DRAM 105 located on INIC card 100. While the data is being transferred, processor 109 on INIC card 100 builds the TCP header 110 in faster SRAM 111. The TCP header is formed in SRAM 111 rather than DRAM 105 because processor 109 needs to perform multiple operations on the header 110 as it is assembled and doing such multiple operations from relatively slow DRAM would unduly slow down processor 109. When all the data has been received onto the INIC 100, processor 109 is able to determine the checksum 104. The complete TCP header 110 including the correct checksum 104 is at that point residing in SRAM 111. Arrow 112 represents the assembly and writing of the complete header 110 from processor 109 to SRAM 111.

[0005] Once the complete header 110 is assembled, it is transferred from SRAM 111 to DRAM 105 in a relatively slow write to DRAM 105. Arrow 113 illustrates this transfer. Once the complete TCP packet 106 is assembled in DRAM 105, the complete packet 106 is output from DRAM 105 to the network 107. In the example of FIG. 2, this transfer is represented by arrow 114.

[0006] Unfortunately, the writing to DRAM 105 is often a relatively slow process and this writing can only begin once all the data has been received onto the INIC card. The result is an undesirable latency in the outputting of the TCP packet onto the network. A solution is desired.

SUMMARY

[0007] A first partial checksum for the header portion of a TCP header is generated on an intelligent network interface card (INIC) before all the data of the data payload of the TCP message has been transferred to the INIC. A pseudopacket with the first partial checksum and the data is assembled in DRAM on the INIC as the data arrives onto the INIC. When the last portion of the data of the data payload is received onto the INIC, a second partial checksum for the data payload is generated. This second partial checksum is not, however, written into DRAM. Rather, the pseudopacket is read out of DRAM for transfer to the network and while the pseudopacket is being transferred the second partial checksum is combined with the first partial checksum such that the resulting final TCP checksum is inserted into the pseudopacket. The pseudopacket is therefore converted into a complete TCP packet with a correct checksum as it is output from the INIC to the network.

[0008] In this way, the slow write to DRAM of the complete TCP header after the payload has already been transferred to DRAM is avoided. Rather than generating the complete TCP checksum and taking the time to write it into DRAM, the complete TCP checksum is generated on the fly as the pseudopacket is transferred from DRAM to the network.

[0009] This summary does not purport to define the invention. The claims, and not this summary, define the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIG. 1 is a simplified diagram of a TCP packet.

[0011] FIG. 2 is a diagram used in explaining the background of the invention.

[0012] FIG. 3 is a diagram of an intelligent network interface card (INIC) in accordance with an embodiment of the present invention.

[0013] FIG. 4 is a diagram of a method in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0014] FIG. 3 is a diagram of an intelligent network interface card (INIC) 200 in accordance with one embodiment of the present invention. INIC 200 is coupled to host computer 201 via PCI bus 202. For additional information on INIC 200, see U.S. patent application Ser. No. 09/464,283, filed Dec. 15, 1999 (the subject matter of which is incorporated herein by reference).

[0015] FIG. 4 is a flow chart that illustrates a method in accordance with an embodiment of the invention. In step 300, data from host memory that is to make up a part of the payload of a TCP message is transferred from host memory 203 to DRAM 204 via PCI bus 202. Hardware in the path of this data determines a first checksum CSUM1 on the fly as the data passes by. This flow of data from host memory 203 to DRAM 204 is indicated on FIG. 3 by arrow 205.

[0016] Although it could be in some situations, in the presently described example not all the data that will make up the TCP data payload is present in the same place in host memory 203. Consequently, the flow of data for the data payload from host memory 203 to DRAM 204 occurs in multiple different data moves as the various different pieces of the data are located and transferred to DRAM 204.

[0017] In step 301, more of the data that is to make up the data payload of the TCP message is moved from host memory 203 to DRAM 204. A second checksum CSUM2 is generated as the data passes through the data path. This data flow is again represented by arrow 205.

[0018] In this example, the data payload is transferred to DRAM in three pieces. In step 302, the last of the data is moved from host memory 203 to DRAM 204 and a third checksum CSUM3 associated with this data is generated.
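The per-piece checksums of steps 300-302 can be modeled with a short Python sketch. The specification does not state the arithmetic used by the data-path hardware, so this sketch assumes the standard 16-bit one's-complement sum used by the TCP checksum. The function name `ones_complement_sum` and the sample byte values are hypothetical, and every piece except the last is assumed to have an even length so that the piece sums can later be added together directly.

```python
def ones_complement_sum(data: bytes) -> int:
    """Return the folded 16-bit one's-complement sum of data (assumed arithmetic)."""
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


# Hypothetical payload arriving from host memory in three separate moves.
pieces = [b"\x00\x01\xf2\x03", b"\xf4\xf5", b"\xf6\xf7"]
csum1, csum2, csum3 = (ones_complement_sum(p) for p in pieces)
```

Each value depends only on its own piece, which is why it can be produced while that piece passes through the data path, without waiting for the rest of the payload.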

[0019] Processor 206, before this transferring is completed, builds in SRAM 207 the TCP header 208 that is to go on the TCP message. Processor 206 does not have all the data for the TCP payload so it cannot determine the complete checksum for the TCP message. It does, however, generate a checksum HDR CSUM 209 for the remainder (header portion 216) of the TCP header. This HDR CSUM is a partial checksum. Arrow 210 in FIG. 3 illustrates the building of the pseudoheader 208 (header portion 216 and partial checksum HDR CSUM 209) in SRAM 207.
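A minimal sketch of building such a pseudoheader follows. It assumes a standard 20-byte TCP header without options, with the checksum field at byte offset 16, and it assumes that HDR CSUM is held as an uncomplemented 16-bit one's-complement sum. The contribution of the IP-level fields that a real TCP checksum also covers is omitted for brevity. The names `build_pseudoheader` and `ones_complement_sum` are hypothetical.

```python
import struct

CHECKSUM_OFFSET = 16  # byte offset of the checksum field in a TCP header


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum of an even-length byte string."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def build_pseudoheader(src_port, dst_port, seq, ack, flags, window) -> bytes:
    """Build the header portion and store the partial checksum in its checksum field."""
    offset_and_flags = (5 << 12) | flags  # data offset of 5 words, no options
    header = bytearray(struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                                   offset_and_flags, window, 0, 0))
    hdr_csum = ones_complement_sum(bytes(header))  # checksum field is still zero
    struct.pack_into("!H", header, CHECKSUM_OFFSET, hdr_csum)
    return bytes(header)


# Hypothetical field values, chosen only for illustration.
pseudoheader = build_pseudoheader(0x1234, 80, 1, 0, 0x18, 8192)
```

Nothing here depends on the payload, so the pseudoheader can be finished while payload data is still arriving.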

[0020] In step 303, while the data payload is being transferred from host memory 203 to DRAM 204 in steps 301-302, the TCP header with the partial checksum HDR CSUM is moved from SRAM 207 to DRAM 204. This transfer is illustrated in FIG. 3 by arrow 211.

[0021] In step 304, after all the data for the data payload has been transferred such that checksums for all the various pieces of the data payload have been generated, processor 206 combines those various data checksums together to form a single checksum for the data payload. In this example, there are three data checksums CSUM1, CSUM2 and CSUM3. These are combined together to make a single data checksum DATA CSUM for the data payload. Processor 206 then supplies this DATA CSUM to a transmit sequencer 212. For additional details on one particular example of a transmit sequencer, see U.S. patent application Ser. No. 09/464,283 (the subject matter of which is incorporated herein by reference). The supplying of the DATA CSUM to transmit sequencer 212 is illustrated in FIG. 3 by arrow 213. At this point, the data payload is present in one place in DRAM 204 in assembled form with the pseudoheader 208 (header portion 216 and HDR CSUM 209) that was transferred from SRAM 207 to DRAM 204 in step 303. This assembly is a pseudopacket (pseudoheader and data payload). It is complete but for the fact that the header does not contain a complete checksum but rather contains the partial checksum HDR CSUM 209.
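The combining of the data checksums in step 304 can be sketched as follows, assuming each of them is a folded 16-bit one's-complement sum and that every piece except the last had an even length (an odd-length piece would need its successor's sum byte-swapped, which this sketch does not handle). The names `add16` and `combine_data_checksums` and the three input values are hypothetical.

```python
def add16(a: int, b: int) -> int:
    """One's-complement addition of two 16-bit values (end-around carry)."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def combine_data_checksums(piece_sums) -> int:
    """Fold the per-piece sums into a single DATA CSUM."""
    data_csum = 0
    for piece_sum in piece_sums:
        data_csum = add16(data_csum, piece_sum)
    return data_csum


# Hypothetical values standing in for CSUM1, CSUM2 and CSUM3.
data_csum = combine_data_checksums([0xF204, 0xF4F5, 0xF6F7])  # 0xDDF2
```

Because one's-complement addition is associative and commutative, the result does not depend on the order in which the piece checksums are combined.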

[0022] In step 305, the transmit sequencer 212 begins transferring the pseudopacket out of DRAM 204 for transmission onto a network 214. Network 214 is, in one embodiment, a local area network (LAN). Transmit sequencer 212 combines the DATA CSUM with the HDR CSUM to create a final checksum and inserts the final checksum into the pseudopacket as the pseudopacket passes over path 215 from DRAM 204 to network 214. What is transferred onto network 214 is therefore a TCP packet having a correct TCP header with a correct checksum.
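The on-the-fly insertion of step 305 can be illustrated in software. This is a simplified model and not a description of the transmit sequencer hardware: it assumes the checksum field sits at byte offset 16 of the pseudopacket, that HDR CSUM and DATA CSUM are uncomplemented 16-bit one's-complement sums, and that the final checksum is the complement of their sum. The name `transmit`, the chunk size and the sample bytes are hypothetical.

```python
import struct

CHECKSUM_OFFSET = 16  # assumed position of the checksum field


def add16(a: int, b: int) -> int:
    """One's-complement addition of two 16-bit values."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def transmit(pseudopacket: bytes, data_csum: int, chunk_size: int = 8):
    """Yield the packet chunk by chunk, patching the checksum field in flight."""
    (hdr_csum,) = struct.unpack_from("!H", pseudopacket, CHECKSUM_OFFSET)
    final = ~add16(hdr_csum, data_csum) & 0xFFFF
    patch = struct.pack("!H", final)
    for start in range(0, len(pseudopacket), chunk_size):
        chunk = bytearray(pseudopacket[start:start + chunk_size])
        for k in range(2):  # the two checksum bytes may fall in different chunks
            pos = CHECKSUM_OFFSET + k - start
            if 0 <= pos < len(chunk):
                chunk[pos] = patch[k]
        yield bytes(chunk)


# Hypothetical pseudopacket: a header whose only nonzero field is 0x1234,
# so HDR CSUM is 0x1234, followed by an 8-byte payload whose sum is 0xDDF2.
header = b"\x12\x34" + bytes(14) + b"\x12\x34" + bytes(2)
payload = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf7"
pseudopacket = header + payload
packet = b"".join(transmit(pseudopacket, 0xDDF2))
```

The copy of the pseudopacket held in memory is never rewritten. Only the bytes on their way out are changed, which mirrors the point that no further write to DRAM is needed once the payload has arrived.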

[0023] Although the functionality of the INIC is described here as being carried out on a separate card, it is to be understood that in some embodiments the functionality of the INIC is carried out on the host computer itself, for example on the motherboard of the host computer. Functionality of the INIC can be incorporated into the host such that payload data from host memory does not pass over a bus such as the PCI bus, but rather the INIC functionality is incorporated into the host in the form of an I/O integrated circuit chip or integrated circuit chipset that is coupled directly to the host memory bus. The I/O integrated circuit chip has a dedicated hardware interface for network communications. Where the INIC functionality is embodied in such an I/O integrated circuit chip, payload data from host memory is transferred to the network from the host memory by passing through the host's local bus, onto the I/O integrated circuit chip, and from the I/O integrated circuit chip's network interface port substantially directly to the network (through a physical layer interface device (PHY)) without passing over any expansion card bus.

[0024] Although the present invention has been described in connection with certain specific embodiments for instructional purposes, the present invention is not limited thereto. The present invention extends to packet protocols other than the TCP protocol. In some embodiments, the first part of the packet is output from the INIC before the final checksum is inserted into the packet. The combining of the DATA CSUM and the HDR CSUM need not be performed by a sequencer and the pseudoheader need not be created by a processor. Other types of hardware and software can be employed to carry out these functions in certain embodiments. In some embodiments, the pseudoheader is assembled in memory or registers inside processor 109 rather than in a separate memory such as SRAM 111. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the claims.

What is claimed is:
 1. A method, comprising: (a) transferring a data payload from a host memory to a first memory of a network interface device; (b) on the network interface device and before the transferring of (a) is complete creating a pseudoheader and storing the pseudoheader in a second memory of the network interface device, the pseudoheader containing a header portion and a checksum portion, the checksum portion being a checksum of the header portion and not a checksum of the data payload; (c) on the network interface device and before the transferring of (a) is complete transferring the pseudoheader from the second memory to the first memory; (d) after (c) generating on the network interface device a checksum for the data payload; and (e) reading the pseudoheader and at least a portion of the data payload from the first memory and combining the checksum for the header portion with the checksum for the data portion to generate a final checksum, the final checksum being inserted into the pseudopacket to form a complete TCP packet, the complete TCP packet being output from the network interface device to a network.
 2. The method of claim 1, wherein the first memory is DRAM and wherein the second memory is SRAM.
 3. The method of claim 1, wherein the second memory has a faster access time than the first memory.
 4. The method of claim 1, wherein the network interface device is part of a host computer, the host memory being another part of the host computer.
 5. An apparatus, comprising: (a) means for transferring a data payload from a host memory to a first memory of a network interface device; (b) means for creating, before the transferring of (a) is complete, a pseudoheader and storing the pseudoheader in a second memory of the network interface device, the pseudoheader containing a header portion and a checksum portion, the checksum portion being a checksum of the header portion and not a checksum of the data payload; (c) means for transferring, before the transferring of (a) is complete, the pseudoheader from the second memory to the first memory; (d) means for generating, after (c), a checksum for the data payload; and (e) means for reading the pseudoheader and at least a portion of the data payload from the first memory and for combining the checksum for the header portion with the checksum for the data portion to generate a final checksum, the final checksum being inserted into the pseudopacket to form a complete TCP packet, the complete TCP packet being output from the network interface device to a network.
 6. The apparatus of claim 5, wherein the apparatus comprises a host computer, the network interface device being a part of the host computer, the host memory being another part of the host computer.
 7. The apparatus of claim 5, wherein the means for reading includes a sequencer, and wherein the means for creating includes a processor.